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1  Executive  Summary 

With  funding  from  the  Defense  Experimental  Program  to  Stimulate  Competitive  Research 
(DEPSCoR),  the  Laboratory  of  Applied  Computational  Electromagnetics  was  created  within 
the  MRC  Institute  at  the  University  of  Idaho.  The  purpose  of  this  funding,  as  stated  in 
the  proposal  “Development  of  a  Characteristic-Based,  Finite-Volume  Method  for  Patch 
Antennas,”  was  to  conduct  research  in  the  development  and  test  of  time-domain  algorithms 
for  Maxwell’s  equations.  Specifically,  the  main  objectives  of  this  research  endeavor  are  as 
follows: 

1.  To  develop  finite-volume  numerical  codes  for  the  square,  circular  and  elliptical  patch 

antennas. 

2.  To  determine  the  optimal  grid  and  grid  density  for  these  geometries. 

3.  To  develop  appropriate  feed  models  so  that  input  impedance  data  can  be  ascertained. 

4.  To  incorporate  higher-order  algorithms  as  they  are  being  developed  and  tested  at  the 

Flight  Dynamics  Directorate,  WPAFB. 

5.  To  investigate  the  use  of  finite-difference  schemes  on  a  general  curvilinear  frame. 

6.  To  numerically  compute  the  effects  of  finite-size  ground  planes  and  superstrate  dielectrics 

on  the  radiation  and  impedance  characteristics  of  the  antennas. 

7.  To  publish  research  results  in  archival,  peer-reviewed  journals. 

As  described  herein,  most  of  these  objectives  have  been  fully  or  partially  satisfied.  Specifi¬ 
cally,  the  following  accomplishments  have  been  realized: 

1.  A  curvilinear,  three-dimensional,  finite-volume,  time-domain  solver  has  been  built,  ex¬ 

tensively  tested  and  applied  to  the  patch  antenna  problem. 

2.  Analytical  studies  on  the  effect  of  grid  density  on  solution  accuracy  have  been  conducted. 

3.  Several  different  feed  models  have  been  devised  and  tested  for  the  excitation  of  the  patch 

antenna. 

4.  A  dedicated  effort  in  devising  and  coding  stable,  high-order  finite-difference  schemes  has 

been  applied. 

5.  Numerous  papers  have  been  submitted  to,  accepted  by  or  printed  in  international  refer- 

reed  journals  and/or  conferences. 

Note:  Objective  five  has  not  been  satisfied.  Our  finite-difference  solver  is  still  being  devel¬ 
oped  for  Cartesian-based  grids.  As  for  objective  six,  the  finite-volume  code  is  capable  of 
conducting  the  stated  objective;  the  numerical  investigation  was  not  completed. 


2 


2  Measurable  Accomplishments 

Tangible  productivity  metrics  of  the  Principal  Investigator’s  (PI)  team  are  defined  by  pub¬ 
lication  output  and  code  development  activities. 

2.1  Publications 

The  following  manuscripts  were  submitted  to  or  published  in  peer-reviewed,  refereed  journals 
during  the  course  of  this  contract: 

1  J.  F.  Nystrom  and  J.  L.  Young,  “k- space  transfer  function  design  of  discrete  operators: 

Application  to  Maxwell’s  time-domain  equations,”  Journal  of  Electromagnetic  Waves 
&  Applications,  vol.  13,  pp.  781-806,  1999. 

2  J.  L.  Young,  “The  design  of  high-order,  leap-frog  integrators  for  Maxwell’s  equations,” 

IEEE  Trans.  Antennas  &  Propagation  (Submitted). 

3  J.  F.  Nystrom,  “High-order,  time-stable  numerical  boundary  condition  scheme  for  the 

temporally  dependent  Maxwell  equations  in  two-dimensions”,  Journal  of  Computa¬ 
tional  Physics  (Submitted). 

4  J.  L.  Young,  R.  Nelson  and  D.  V.  Gaitonde,  “A  Detailed  examination  of  the  finite-volume, 

time-domain  method  for  Maxwell’s  equations.”  Journal  of  Electromagnetic  Waves  & 
Applications  (Submitted). 

The  following  manuscripts  were  presented  at  international  conferences: 

5  J.  F.  Nystrom  and  J.  L.  Young,  “/c-space  transfer  function  design  of  discrete  operators:  Ap¬ 

plication  to  Maxwell’s  time-domain  equations,”  37th  AIAA  Aerospace  Science  Meeting 
and  Exhibit,  AIAA  99-337,  Reno,  NV,  1999. 

6  J.  L.  Young,  “The  design  of  high-order  leap-frog  integrators  for  Maxwell’s  equations,” 

IEEE  Antennas  and  Propagation  International  Symposium  and  USNC/URSI  National 
Radio  Science  Meeting ,  Orlando,  FL.,  pp.  176-179,  1999. 

7  J.  F.  Nystrom  and  J.  L.  Young,  “High-order,  finite-difference  procedure  for  the  tempo¬ 

rally  dependent  Maxwell’s  equations,”  IEEE  Antennas  and  Propagation  International 
Symposium  and  USNC/URSI  National  Radio  Science  Meeting,  Atlanta,  GA.,  1998. 

8  J.  L.  Young  and  J.  F.  Nystrom,  “Designing  high-order,  time-domain  numerical  solvers 

for  Maxwell’s  equations,”  IEEE  Antennas  and  Propagation  International  Symposium 
and  USNC/URSI  National  Radio  Science  Meeting,  Atlanta,  GA.,  pp.  546-549,  1998. 
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The  following  papers  resulted  from  a  joint  research  collaboration  between  the  PI  and  the 
scientists  in  the  Air  Force  Research  Laboratory  (also  see  item  4): 

9  D.  Gaitonde,  J.  Shang  and  J.  L.  Young,  “Practical  aspects  of  higher-order  numerical 

schemes  for  wave  propagation  phenomena,”  Inti.  J.  Num.  Methods  in  Engr .,  vol.  45, 
pp.  1849-1869,  1999. 

10  D.  Gaitonde,  J.  S.  Shang  and  J.  L.  Young,  “Practical  aspects  of  high-order  accurate 

finite-volume  schemes  for  electromagnetics,”  35th  AIAA  Aerospace  Science  Meeting 
and  Exhibit,  AIAA  97-0363,  Reno,  NV,  1997. 

The  following  research  endeavor  resulted  from  a  collaboration  with  Prof.  Dennis  Sullivan 
of  the  University  of  Idaho,  recipient  of  the  U.S.  Army  Research  Office  DEPSCoR  grant 
DAAH04-96-1-0406: 

11  D.  Sullivan  and  J.  L.  Young,  “Far-field  time-domain  calculation  from  aperture  radiators 

using  FDTD  method,”  IEEE  Trans.  Antennas  and  Propagation  (Submitted). 

In  addition  to  the  stated  research  objectives  of  the  contract,  the  PI  also  engaged  in  the  au¬ 
thoring  of  one  invited  publication  that  is  relevant  to  the  general  electromagnetic  community: 

12  R.  G.  Olsen,.  J.  L.  Young  and  D.  C.  Chang,  “Electromagnetic  wave  propagation  on  a 

thin  wire  above  earth,”  IEEE  Trans.  Antennas  and  Propagation. 

The  last  entry  was  an  invited  submission  for  the  special  transactions  issue  in  honor  of  Prof. 
James  Wait;  co-authors  Olsen  and  Chang  are  fellows  of  the  IEEE. 


2.2  Numerical  Solvers 

Another  measurable  achievement  of  this  contract  is  the  development  of  five  solvers:  3WB1- 
RK4  FVTD  solver,  D4-RK4  FDTD  solver,  CC4-RK4  solver,  C4-LF4  solver  and  Yee  FDTD 
solver.  A  synopsis  of  each  of  these  solvers  and  some  representative  results  obtained  from 
them  is  given  below.  For  technical  and  mathematical  details  associated  with  these  solvers, 
consult  the  publication  list  under  section  2.1  and  the  appropriate  appendices  attached  to 
this  report. 


2.2.1  3WB1-RK4  FVTD  Solver 


The  finite-volume,  time-domain  (FVTD)  solver  is  based  upon  a  conservation  equation  of 
the  following  form: 


where  U°  is  the  average  value  of  the  solution  and  n  x  F  is  the  flux  crossing  the  surface  S. 
(See  Appendix  A  for  a  comprehensive  mathematical  treatment  of  the  scheme.)  The  vectors 
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U“  and  F  are  cast  in  terms  of  E  and  H  (the  electric  and  magnetic  field  vectors,  respectively) 
such  that  the  previous  equation  is  a  restatement  of  Maxwell’s  curl  equations  in  global  form. 

To  accomplish  the  spatial  discretization,  a  primitive  vector  is  invoked  and  cast  in  terms 
of  the  average  value  Ua.  The  derivative  of  this  vector  is  proportional  to  the  surface  integral 
of  the  right-hand  side  of  Eqn.  (1).  Thus,  the  degree  of  spatial  accuracy  is  directly  related 
to  the  accuracy  of  the  difference  operator  that  approximates  this  derivative.  The  final  result 
is  a  reconstructed  value  of  F  in  terms  of  the  average  value  of  U.  One  possible  operator  is  the 
second-order  central  difference  operator  (i.e.,  2C).  Although  second-order  differences  lead 
to  a  consistent  scheme,  they  also  lead  to  grid-decoupled  solutions.  Instead,  we  adopt  the 
notion  of  upwind  differencing,  which  was  first  pioneered  by  the  computational  fluid  dynamics 
(CFD)  community.  To  apply  upwind  differencing  and  to  obtain  a  stable  scheme  requires  one 
to  examine  the  eigenvalues  and  eigenvectors  of  the  flux  Jacobian  matrices  associated  with 
Maxwell’s  equations.  Based  upon  the  sign  of  these  eigenvalues,  which  indicate  the  direction 
of  wave  propagation  along  a  principle  coordinate  axis,  forward  or  backward  differences  are 
invoked.  A  third-order  spatially  accurate  scheme  is  realizable  if  windward-biased  stencils 
(i.e.,  3WB1)  are  employed.  Since  the  solution  is  discretized  within  its  domain  of  influence, 
the  semi-discrete  equations  are  both  consistent  and  grid-coupled. 

To  advance  the  equations,  the  four-stage  Runge-Kutta  integrator  is  invoked,  which  is 
denoted  as  RK4.  A  stability  analysis  of  the  combined  RK4-3WB1  scheme  reveals  that  a 
Fourier-stable  scheme  is  achievable  for  CFL  values  less  than  1.74.  This  same  analysis  also 
provides  information  on  the  scheme’s  dissipation  and  dispersion  errors,  which,  as  with  all 
upwind  schemes,  is  dissipation  dominant.  The  effect  of  this  dissipation  is  made  manifest 
in  the  data  associated  with  the  patch  antenna  simulation  (to  be  described  in  a  subsequent 
paragraph). 

Finally,  to  truncate  the  computational  domain,  the  incoming  tangential  flux  is  set  to  zero. 
This  type  of  absorbing  boundary  condition  is  first-order  accurate  since  it  is  based  upon  a 
one-dimensional  wave  equation.  However,  several  numerical  tests  confirm  its  adequacy  as 
domain  truncation  operator. 

Based  upon  the  previous  description,  a  code  was  constructed  and  thoroughly  tested.  For 
representative  results  obtained  from  this  methodology,  first  consider  a  simple  problem  of 
a  gaussian  pulse  in  a  one-dimensional,  perfectly  conducting  cavity.  The  form  of  the  pulse 
is  e~w  *  ,  where  w  =  4.14  x  1010  1/s;  this  value  for  w  yields  a  frequency  spread  in  excess 
of  20  GHz.  For  the  simulation  parameters,  let  8t  =  1.179  ps  and  8X  =  0.0005  m.  The 
pulse  is  assumed  to  be  traveling  at  the  speed  of  light.  To  ascertain  both  the  dissipative  and 
dispersive  effects  on  the  pulse,  we  marched  the  pulse  11,314  time  steps,  which  is  the  required 
time  for  the  pulse  to  attenuate  by  ten  percent  from  its  initial  value;  see  Figure  1.  The  slight 
ripple  effect  seen  in  the  plot  of  Figure  1  is  to  be  expected  due  to  dissipation,  dispersion 
and  boundary  related  errors.  However,  since  the  pulse  has  traveled  twenty  round  trips,  such 
effects  are  certainly  anticipated. 

To  validate  the  dielectric  boundary  condition  methodology,  the  problem  of  a  gaussian 
pulse  within  a  partially  filled  dielectric  cavity  is  considered;  the  dielectric  has  a  relative 
permittivity  value  of  2.3  and  fills  the  first  fifty  cells  of  the  200  cell  cavity.  The  pertinent 
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Cell  Number 

Figure  1:  Electric  field  inside  perfectly  conducting  cavity. 

parameters  are  the  same  as  in  the  previous  example.  The  initial  pulse  is  centered  at  cell  100 
and  is  traveling  to  the  left  as  time  advances.  Representative  results  after  230  time  steps  are 
provided  in  Figure  2.  As  seen  directly  from  the  figure,  the  theoretical  transmission  coefficient 
of  -0.795  and  the  theoretical  reflection  coefficient  of  -0.205  are  predicted. 

The  one-dimensional  problems  clearly  demonstrate  the  validity  of  the  FVTD  theory. 
The  aforementioned  examples  demonstrate  both  the  long-term  stability  of  the  simulation 
and  the  validity  of  the  designed  boundary  operators  for  perfectly  conducting  and  dielectric 
interfaces.  For  three-dimensional  curvilinear  domains,  the  results  are  not  as  promising,  as 
discussed  next. 

The  propagation  of  an  electromagnetic  wave  in  a  cylindrical  waveguide  of  radius  1  m  is 
the  subject  of  the  following  discussion;  the  waveguide  is  filled  with  free-space.  The  domain 
spans  10  x  20  x  100  cells  in  p,  <f>  and  z,  respectively.  The  domain  is  assumed  to  be  periodic  in  z 
and  the  electromagnetic  wave  is  initialized  with  the  TEn  time-harmonic  mode  of  frequency 
u  =  2n  r/s.  In  the  radial  direction,  the  signal  is  sampled  at  twenty  points  per  wavelength. 
For  this  study,  let  c  =  1  m/s,  Sp  =  0.1  m,  5$  =  27r/20  r,  Sz  =  5A*/100  =  0.05229  m 
and  6t  =  0.01269  s.  Figure  3  shows  the  magnetic  field  distribution  among  various  planes 
within  the  guide.  This  figure  serves  as  a  qualitative  check  by  showing  that  the  mode  has 
been  properly  set  up  within  the  guide.  Next,  consider  Figures  4  and  5,  which  show  all  six 
components  of  the  electromagnetic  signal  as  a  function  of  radial  cell  distance  at  time  steps 
200  and  2000,  respectively.  Clearly,  the  solver  is  able  to  predict  the  basic  field  phenomenology 
within  the  guide.  Upon  closer  examination  of  the  data,  however,  a  small  effect  of  numerical 
dissipation  in  the  early  time  and  large  dissipative  effects  in  the  late  time  are  observed.  Note: 
After  2000  times  steps,  the  total  energy  in  the  electromagnetic  wave  has  decreased  by  a 
factor  of  three.  The  highly  dissipative  nature  of  the  computed  solution  is  consistent  with 
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Figure  2:  Electric  field  inside  a  partially  filled  dielectric  cavity. 

the  theoretical  performance  of  windward  schemes.  A  second  effect  is  also  made  manifest  in 
the  data:  A  non-zero  longitudinal  electric  field.  The  longitudinal  component  is  indeed  small 
relative  to  its  transverse  counterparts,  which  implies  that  solver  has  set  up  a  quasi-TE  mode 
rather  than  a  pure  TE  mode.  The  generation  of  this  non-physical  component  is  due  to  the 
grid  and  its  corresponding  metrics.  Since  each  cell  is  not  a  parallelepiped,  the  numerically 
computed  metrics  are  only  second-order  accurate  (for  parallelepiped  cells,  the  metrics  are 
computed  exactly.)  Hence,  the  nonlinear  terms  of  the  metrics  are  not  captured  and  the 
corresponding  solution  is  corrupted  by  metric-induced  errors. 

A  microstrip  transmission  line  connected  to  a  patch  antenna  supported  by  a  dielectric  of 
relative  permittivity  of  2.2  is  considered  next;  see  Figure  6.  The  same  parameters  suggested 
in  Sheen  et  al.  [1]  and  Luebbers  et  al.  [2]  are  also  used  here.  Specifically,  let  5X  =  .389  mm, 
Sy  =  .400  mm  and  Sz  =  .265  mm;  to  insure  stability,  let  St  =  .555  ps.  The  patch  antenna 
spans  32  cells  by  40  cells;  the  patch  height  is  3  cells.  The  transmission  line  is  chosen  to  be 
20  cells  long  with  the  terminal  plane  placed  in  the  middle  of  the  line.  Allowing  for  some 
buffering  between  the  absorbing  boundary  condition  and  the  antenna,  we  choose  the  domain 
size  to  be  88  x  111  x  50  cells.  The  voltage  excitation  is  gaussian  and  of  the  form  e-™2*2, 
where  w  =  4.14  x  1010  1/s;  this  value  for  w  yields  a  frequency  spread  in  excess  of  20  GHz.  A 
50  Ohm  standard  is  adopted  for  this  simulation.  To  capture  both  low-  and  high-frequency 
effects,  which  are  manifested  in  both  the  late  and  early  time  response,  the  discrete  equations 
are  marched  5000  times.  Representative  field  contour  plots  above  and  below  the  patch  are 
shown  in  Figures  7  and  8,  respectively.  Numerous  plots  such  as  these  have  been  analyzed 
in  detail;  the  conclusion  of  that  analysis  is  that  the  solver  is  predicting  the  correct  wave 
behavior  and  capturing  the  correct  boundary  conditions. 

Figure  9  shows  the  calculated  input  reflection  coefficient  associated  with  two  solvers:  1) 
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Figure  3:  The  magnetic  field  within  the  cylindrical  waveguide. 


Figure  4:  The  field  components  of  E  and  H  within  the  cylindrical  waveguide  at  t  =  200<5t. 
The  dotted  lines  are  the  analytical  solution;  the  solid  lines  are  the  computed  solution. 
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Figure  5:  The  field  components  of  E  and  H  within  the  cylindrical  waveguide  at  t  —  2, 0005t. 
The  dotted  lines  are  the  analytical  solution;  the  solid  lines  are  the  computed  solution. 


Figure  6:  A  layout  of  the  patch  antenna  under  consideration. 
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Figure  7:  A  representative  field  contour  plot  above  the  patch. 


Figure  8:  A  representative  field  contour  plot  below  the  patch. 
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Luebbers 

Sheen 

FDTD 

FVTD-1 

FVTD-2 

Measured 

7.51 

7.39 

7.37 

7.13 

7.28 

7.39 

10.04 

9.78 

9.79 

9.31 

9.64 

9.85 

12.13 

12.09 

11.88 

11.72 

11.99 

N/A 

14.66 

14.48 

14.08 

13.72 

14.13 

14.63 

17.91 

18.06 

17.49 

N/A 

17.81 

18.43 

19.78 

19.85 

19.25 

N/A 

N/A 

20.00 

Table  1:  Resonant  frequencies  in  GHz  for  the  patch  antenna  of  Figure  2. 


the  popular  Yee  FDTD  solver  and  2)  the  3WB1-RK4  FVTD  solver.  Here  FVTD-1  and 
FDTD  have  identical  numerical  parameters.  FVTD-2  uses  a  more  refined  grid  and  temporal 
step  size  in  which  Sx  =  .265  mm,  8y  =  .265  mm,  Sz  =  .0795  mm  and  St  =  .2114  ps;  the 
domain  size  is  111  x  154  x  80.  As  Table  1  suggests,  both  solvers  predict  the  same  resonant 
(or  radiation)  frequencies  of  the  antenna.  These  frequencies  also  compare  well  with  the 
published  resonant  frequencies  of  Sheen  et  al.  [1]  and  Luebbers  [2]  et  al..  However,  it  is 
also  clear  from  the  reflection  coefficient  data  shown  in  Figure  9  that  the  data  generated  by 
the  better  of  the  two  finite-volume  schemes  (i.e.,  FVTD-2)  is  anywhere  from  2  to  4  dB  more 
dissipative  than  the  data  generated  by  the  Yee  scheme  (and  the  data  measured  by  Sheen). 
This  observation  is  expected,  due  to  the  highly  dissipative  nature  of  the  up-wind  schemes, 
and  is  corroborated  by  the  data  associated  with  the  circular  waveguide  study. 

To  further  emphasize  the  dissipative  nature  of  the  upwind  approach,  we  consider  a  50  D 
microstrip  transmission  line.  The  line  is  8  cm  long  and  0.233  cm  wide;  the  strip  is  placed  on 
top  of  a  grounded  dielectric  slab  with  a  relative  permittivity  of  2.2  and  a  thickness  of  .795 
mm.  For  the  computational  parameters,  let  8X  —  .389  mm,  Sy  =  .400  mm,  5Z  =  .265  mm 
and  5t  =  .555  ps;  the  computational  domain  spans  46  x  240  x  30  cells.  The  microstrip  is 
excited  in  a  similar  fashion  as  the  microstrip  associated  with  the  patch  antenna  study.  To 
monitor  the  propagation  of  the  wave  on  the  line  as  a  function  of  space,  three  voltage  and 
three  current  probes  were  placed  at  2,  4  and  6  cm  away  from  the  source.  The  numerical  data 
gathered  by  these  three  probes  is  plotted  in  Figure  10.  In  this  figure  we  see  the  expected 
numerically  generated  dissipation.  However,  the  veracity  of  the  data  is  confirmed  by  noting 
that  the  voltage  and  current  are  proportional  to  the  characteristic  impedance  of  the  line. 
Since  both  voltage  and  current  dissipate  at  the  same  rate  down  the  line,  the  ratio  of  50  Ohms 
is  not  impacted  by  numerical  dissipation. 

More  disconcerting  than  the  dissipation  errors  of  the  FVTD  solver  is  the  solver’s  poor 
computational  efficiency.  To  create  the  data  for  the  patch  antenna  study,  FVTD-2  took 
approximately  10.0  hours  of  Cray  T90  CPU  time  and  108  MW  of  core  memory.  In  contrast, 
the  FDTD  Yee  solver,  took  only  0.05  hours  of  Cray  T90  CPU  time  and  5.9  MW  of  core 
memory.  Even  if  some  additional  effort  were  applied  to  try  to  improve  the  computational 
through-put  of  the  FVTD  solver  by  a  factor  of  ten,  it  is  clear  that,  for  microwave  planar 
circuit  applications,  the  upwind  MUSCL  approach  is  inferior. 

Since  the  3WB1-RK4  finite-volume  scheme  is  nodally  collocated  and  temporally  syn- 
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Frequency  (GHz) 


Figure  9:  Reflection  coefficient  data  associated  with  the  patch  antenna. 


Figure  10:  Voltage  and  current  probe  data  associated  with  the  50  Ohm  microstrip  transmis¬ 
sion  line. 
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Figure  11:  Near-field  electromagnetic  field  data  about  a  line  source.  The  striping  of  the 
data  suggests  strong  grid  decoupling. 


chronized,  several  algorithmic  issues  arise.  The  first  issue  is  the  application  of  boundary 
conditions  at  a  perfectly  conducting  interface.  In  this  situation,  the  scheme  requires  that 
both  tangential  E  and  H  be  specified  on  the  boundary.  However,  due  to  the  fact  that  tan¬ 
gential  E  satisfies  a  Dirichlet  condition  and  tangential  H  satisfies  a  Neumann  condition, 
the  boundary  conditions  associated  with  this  scheme  are  over-specified.  Hence,  an  incorrect 
estimate  for  tangential  H  can  produce  errors  in  the  data  or  induced  instabilities  in  the  sim¬ 
ulation.  Second,  even  though  upwind  schemes  are  well  known  to  prevent  grid  decoupling, 
grid  decoupling  can  be  induced  via  the  Runge-Kutta  integrator.  Since  the  Runge-Kutta 
integrator  requires  that  both  field  components  be  advanced  simultaneously,  the  excitation 
of  one  field  component  is  not  made  manifest  to  the  other  field  component  at  the  instant  the 
excitation  is  applied.  This  effect  is  most  noticeable  when  modeling  volumetric  and  sheet 
sources;  the  effect  is  observed  near  the  source,  where  the  data  extremely  alternates  between 
positive  and  negative  values,  as  shown  in  Figure  11. 

2.2.2  D4-R.K4  FDTD  Solver 

The  fourth-order,  finite-difference,  time-domain  solver,  labeled  D4-RK4  FDTD,  is  based 
upon  a  procedure  that  embeds  strict,  time-stable  theorems  into  the  design  of  high-order 
boundary  operators.  To  realize  a  time-stable,  fourth-order  (in  space  and  time)  algorithm, 
a  standard  fourth-order  Runge-Kutta  (RK4)  integrator  is  used  to  temporally  advance  the 
equations;  the  spatial  stencils  and  boundary  operators  are  designed  using  summation-by¬ 
parts  techniques  based  upon  a  summation-by-parts  energy  norm. 

To  develop  the  theory,  the  case  of  z-directed,  one-dimensional  propagation  is  couched 
in  terms  of  a  2  x  2  hyperbolic  system:  wt  =  A  wz,  where  wT  =  [ExcBy\  is  the  solution 
vector  and  wz  is  the  partial  derivative  of  the  solution  vector  with  respect  to  the  z-direction. 
For  the  specific  problem  of  propagation  in  a  perfectly  conducting  cavity,  the  tested  spatial 
discretizations  utilize  a  fourth-order,  restricted,  full-norm  operator  (RF4)  or  a  fourth-order 
diagonal  norm  operator  (D4).  For  the  one-dimensional  scalar  avection  equation,  the  RF4 


operator  has  the  form 
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where  the  coefficients  qij  are  given  in  the  literature  by  Strand  [4].  The  coefficients  in  Eqn. 
(2)  encompass  the  transition  zone  between  a  third-order  windward  stencil  (i.e.,  3W)  on 
the  boundary  and  a  fourth-order  central  operator  (i.e.,  4C)  in  the  interior.  Furthermore, 
these  coefficients  are  designed  so  that  the  spatial  discretization  satisfies  a  summation-by¬ 
parts  energy  norm.  To  introduce  the  physical  boundary  conditions  and  to  yield  time-stable 
simulations,  both  test  cases  (i.e.,  RF4  or  D4)  utilize  the  orthogonal  projection  method  of 
Olsson  [5]. 

For  two-dimensional  TE2  mode  propagation  in  the  xy-plane,  the  Maxwell  curl  equations 
are  couched  in  terms  of  a  3x3  hyperbolic  system:  ut  =  Px  ux+Py  uy ,  where  uT  =  [Ex  Ey  cBz ]. 
The  previous  system  is  applicable  near  and  on  planar  boundaries  but  not  on  corners,  where 
an  ambiguity  exists  for  the  outwardly  pointing  normal.  To  remove  the  ambiguity,  the  system 
of  equations  is  modified  and  reformulated  into  a  new  scattered-field  initial  value  problem  of 
the  form  ut  =  P  Pxux  +  P  Pyuy  +  (I  —  P)  gt ,  where  g  incorporates  the  boundary  condition. 
Specifically, 
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The  previous  equation  is  valid  at  the  corner  of  a  PEC  scatterer  with  a  normal  defined  as 
n  =  (±l/\/2,  ±l/\/2). 

For  two-dimensional  simulations,  the  D4  stencil  is  employed  to  obtain  a  semi-discrete 
approximation  of  the  3x3  hyperbolic  system  representing  electromagnetic  fields  TE  to  2.  To 
properly  handle  the  domain  truncation,  the  perfectly  matched  layer  of  Zhao  and  Cangellaris 
[6]  is  used.  All  simulations  for  TEZ  mode  propagation  show  time-stable  results.  To  test 
the  efficacy  of  the  D4-RK4  scheme,  a  sample  scattering  geometry  is  chosen  and  results  from 
the  D4-RK4  scheme  are  compared  to  those  obtained  from  our  Yee  FDTD  solver.  Long-time 
integrations  are  performed  for  both  schemes  to  produce  surface  current  data,  as  shown  in 
Figure  12,  for  a  TEZ  pulse  obliquely  incident  on  a  2.5A  x  10A  cylinder.  The  convergence 
of  the  two  separate  techniques  to  the  same  answer  provides  verification  for  results  obtained 
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Perimeter  Distance 

Figure  12:  Yee  and  D4  calculated  surface  currents  for  obliquely  incident  TEZ  wave  on  a 
2.5A  x  10A  cylinder. 

from  the  fully  discrete  D4-RK4  scheme.  A  representative  contour  plot  of  the  electric  field  is 
shown  in  Figure  13;  the  scatterer  in  this  case  is  a  10A  square  cylinder. 

Efforts  to  expand  this  theory  to  three-dimensional  geometries  are  underway.  Currently, 
the  three-dimensional  version  of  this  code  does  not  yield  data  that  is  time  stable.  We  surmise 
that  the  boundary  operators  for  a  corner  of  a  three-dimensional  scatterer  are  ill-posed  and 
hence,  need  to  be  reexamined.  An  alternative  to  developing  new  boundary  operators  is  to 
develop  spatial  filters  that  dampen  out  high-frequency  instabilities.  (This  technique  has 
been  successfully  applied  by  Drs.  Shang  and  Gaitonde  of  the  Air  Force  Wright  Laboratory.) 
A  second  option  is  to  invoke  high-order  differences  associated  with  the  Yee  grid  and  the 
newly  developed,  fourth-order,  leap-frog  integrator.  Preliminary  test  cases  for  this  method 
have  been  implemented  in  code;  the  results  of  this  effort  are  encouraging.  The  development 
of  a  stable,  high-order,  three-dimensional  finite  difference  code  will  continue  into  the  year 
2000. 

2.2.3  CC4-RK4  Solver 

The  CC4-RK4  solver  utilizes  a  collocated  grid,  fourth-order  central  compact  differencing 
and  four  stage  Runge-Kutta  integration.  Details  associated  with  this  algorithm  and  code, 
along  with  representative  data,  are  available  in  the  following  reprint: 

J.  F.  Nystrom  and  J.  L.  Young,  uk- space  transfer  function  design  of  discrete 
operators:  Application  to  Maxwell’s  time-domain  equations,”  Journal  of  Elec¬ 
tromagnetic  Waves  &  Applications,  vol.  13,  pp.  781-806,  1999. 


15 


Figure  13:  A  representative  contour  plot  of  the  electric  field  associated  with  a  10A  square 
cylinder. 

2.2.4  C4-LF4  Solver 

The  C4-LF4  solver  utilizes  the  Yee  grid  in  conjunction  with  the  newly  developed,  high- 
order,  leap-frog  integrator;  consult  Appendix  B  for  technical  and  mathematical  details.  In 
principle  any  even-order  accurate  spatial  stencil  of  either  the  implicit  or  explicit  kind  may 
be  invoked.  However,  for  purposes  of  boundary  condition  implementation,  we  invoke  fourth- 
order  explicit  central  differences.  Since  little  is  to  be  gained  by  integrating  to  a  higher-order 
of  accuracy  than  that  of  the  spatial  operators,  the  leap-frog  integrator  is  limited  to  fourth- 
order  as  well. 

The  validation  of  the  numerical  theory  is  accomplished  by  considering  wave  propagation 
in  a  rectangular  waveguide.  Due  to  the  existence  of  an  exact  analytical  solution,  this  problem 
allows  us  to  conduct  precise  numerical  experiments  on  a  three-dimensional  domain  structure. 
Furthermore,  this  structure  requires  the  algorithm  to  employ  hard  boundary  conditions  on 
the  field  vectors  at  the  walls  of  the  waveguide.  For  the  spatially  second-order  discretization, 
no  special  treatment  is  required  to  enforce  the  requisite  boundary  conditions.  With  respect 
to  the  spatially  fourth-order  discretization,  the  cross-sectional  domain  is  extended  using 
image  fields.  For  Dirichlet  conditions,  the  image  fields  are  180  degrees  out  of  phase  with 
the  true  fields.  For  Neumann  conditions,  the  image  fields  and  the  true  fields  are  identical  in 
value. 

In  all  cases  below,  the  waveguide’s  cross-sectional  dimensions  are  a  =  b  =  1  m  and 
the  excitation  angular  frequency  is  2n  r/s;  the  speed  of  light  is  normalized  to  1  m/s.  For 
these  parameters,  the  TMn  mode  is  free  to  propagate  and  hence,  it  is  the  mode  under 
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Figure  14:  A  grid  density  study  of  the  2x2  and  the  4x4  schemes. 

consideration.  To  avoid  any  spurious  artifacts  from  the  source  or  from  the  termination,  the 
waveguide’s  length  is  chosen  to  be  exactly  5AZ,  where  for  the  TMn  mode,  Xz  =  \/2  m.  In 
so  doing,  we  create  a  waveguide  that  is  periodic  in  the  ^-direction  and  hence,  we  enforce  a 
periodic  boundary  condition  on  the  source/terminating  planes.  The  field  is  initialized  with 
the  exact  solution  throughout  the  entire  waveguide. 

The  first  experiment  verifies  that  the  codes  are  performing  according  to  theory.  To  this 
end,  a  grid  density  study  on  the  2x2  scheme  (i.e.,  the  popular  FDTD  scheme)  and  the  4x4 
scheme  is  conducted.  The  signal  in  the  waveguide  is  sampled  at  5,  10  and  15  points  per 
free-space  wavelength  in  all  three  directions  and  the  time  step  is  set  to  one  half  the  value 
of  the  cell  size.  In  each  experiment,  the  total  number  of  time  steps  is  adjusted  such  that 
the  total  simulation  time  is  the  same,  which  is  arbitrarily  chosen  to  be  25  seconds.  The 
standard  Li  error  norm,  averaged  over  Xz,  is  calculated  at  the  end  of  each  simulation  and 
the  logarithm  thereof  is  plotted  as  a  function  of  the  logarithm  of  the  cell  size.  The  resulting 
data  is  shown  in  Figure  14.  The  slope  of  these  curves  is  the  numerical  accuracy  order  of  the 
scheme.  For  the  2x2  scheme,  the  end-point  to  end-point  slope  is  1.8;  for  the  4x4  scheme 
the  end-point  to  end-point  slope  is  4.2.  These  values  are  close  enough  to  the  values  of  two 
and  four,  respectively,  to  conclude  that  the  codes  are  a  true  realization  of  the  2x2  and  4x4 
algorithms.  Since  the  2x4  and  4x2  codes  are  subsets  of  the  2x2  and  4x4  codes,  no  further 
validation  is  necessary. 

Typical  plots  of  the  exact  field  (i.e.  Hz)  and  the  computed  fields  using  the  2x2  and  the 
4x4  algorithms  are  shown  Figure  15.  For  these  plots,  5X  =  Sy  =  dz  =  0.1  m  and  8t  =  0.05  s. 
For  a  frequency  of  27t  radians,  the  field  is  sampled  at  10  points  per  free-space  wavelength. 
The  total  simulation  time  is  500(5* .  In  Figure  15,  the  anticipated  result  of  significant  phase 
degradation  for  the  2x2  scheme  is  obvious;  the  phase  error  is  19.2  degrees.  As  for  the  4x4 
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Figure  15:  Plots  of  Hz  at  t  =  5005t:  8X  =  .lm  and  St  —  .5 5X. 

scheme,  the  phase  error  is  visually  imperceptible;  its  value  is  only  1.0  degrees.  Both  schemes 
show  no  attenuation  of  the  signal,  thereby  substantiating  the  claim  that  the  2x2  and  the  4x4 
schemes  are  dissipationless. 

To  appreciate  the  effects  of  phase  error  accumulation  as  a  function  of  time,  Figure  16  is 
provided.  From  this  figure,  the  slope  associated  with  the  2x2  scheme  is  19.2  times  that  of  the 
4x4  scheme.  This  dramatic  improvement  in  phase  error  accumulation  further  substantiates 
the  claim  that  the  4x4  scheme  may  be  used  to  propagate  the  signal  significantly  farther  than 
the  2x2  scheme.  For  the  present  scenario,  the  group  energy  in  the  signal  has  propagated 
17.7  free-space  wavelengths.  The  4x4  scheme  can  thus  propagate  the  signal  339.8  free-space 
wavelengths  for  a  duration  of  9, 600 5t‘,  at  the  end  of  the  simulation  its  phase  error  would  also 
be  19.2  degrees. 

Another  interesting  feature  of  Figure  16  is  the  poor  phase  performance  of  the  4x2  and 
2x4  schemes.  Both  of  the  schemes  have  poorer  phase  performance  relative  to  the  2x2  scheme. 
This  result  for  the  4x2  scheme  is  in  agreement  with  the  numerical  theory.  As  for  the  2x4 
scheme,  its  poor  phase  error  performance,  relative  to  the  phase  error  of  the  2x2  scheme  may 
in  part  be  related  to  boundary  condition  induced  errors. 

2.2.5  Yee  FDTD  Solver 

To  establish  bench-marking  data  for  all  of  the  aforementioned  solvers,  the  finite-difference, 
time-domain  scheme  of  Yee  and  the  perfectly  match  layer  concept  for  open  domains  was 
also  implemented  into  numerical  code.  The  procedures  for  doing  so  are  well-documented  in 
the  literature  and  hence,  no  detailed  description  is  needed  here.  This  solver  was  devised  and 
tested  on  several  microstrip  planar  geometries,  including  the  patch  antenna,  as  confirmed 
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Figure  17:  A  layout  of  the  band-pass  filter  under  consideration.  The  substrate  thickness  is 
1.272  mm  and  its  relative  permittivity  is  9.8. 

in  section  2.2.1.  For  further  confirmation,  consider  the  band-pass  filter  configuration  of 
Figure  17.  This  geometry  was  programmed  into  the  solver  as  a  sequence  of  metallic  patches, 
which  implies  that  any  geometry  made  up  from  an  indefinite  number  of  metallic  patches 
(attached  or  otherwise)  could  be  similarly  programmed  into  code.  The  source  consists  of  an 
ideal  voltage  source  in  series  with  a  50  Ohm  resistor;  both  are  modeled  using  quasi-static 
arguments.  The  transmission  line  is  terminated  with  a  50  Ohm  load.  Letting  Sx  =  0.212mm, 
Sy  =  0.212mm,  5Z  =  0.212mm  and  St  =  .353ps,  and  running  the  code  for  20,000  time  steps, 
we  recorded  the  total  voltage  at  the  input  and  output  ports  as  a  function  of  time.  These 
values  were  then  transformed  into  the  frequency  domain,  from  which  the  Sn  parameters  were 
calculated.  The  results  of  this  calculation  are  shown  in  Figure  18  along  with  the  calculated 
results  of  Shibata  et  al.  [3].  As  expected,  the  agreement  between  data  sets  is  quite  good. 
The  confirmation  between  independently  generated  data  sets  confirms  the  bench-marking 
capability  of  this  solver. 
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Figure  18:  Sn  data  associated  with  the  band-pass  filter. 

3  Challenges 

During  the  course  of  the  contract,  several  challenges  arose.  The  most  significant  ones  are 
outlined  below. 

3.1  Non-Technical  Challenges 

There  was  one  non-technical  challenge  that  arose  at  the  beginning  of  the  contract:  Grad¬ 
uate  student  recruitment.  Due  to  tight  national  labor  markets  and  low  enrollments  in  the 
university’s  graduate  program,  it  was  difficult  to  identify,  recruit  and  hire  quality  gradu¬ 
ate  students.  In  the  end,  the  PI  was  able  to  hire  two  outstanding  students.  However,  one 
was  hired  in  January  1997  and  the  other  in  January  1998;  thus,  the  initial  progress  on  this 
contract  was  greatly  hindered. 

3.2  3WB1-RK4  FVTD  Solver:  Technical  Challenges 

In  the  development  of  the  3WB1-RK4  FVTD  solver,  several  technical  challenges  were  pre¬ 
sented  to  us  over  the  course  of  the  contract.  These  are  briefly  articulated  below. 

Formulation  At  the  beginning  of  the  contract,  we  adopted  the  standard,  fluid  dynamics, 
conservative  form  formulation  for  Maxwell’s  equations.  Although  mathematically  cor¬ 
rect,  this  form  masks  many  of  the  physical  attributes  of  the  electromagnetic  field.  Also, 
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from  a  software  coding  point  of  view,  this  form  requires  many  redundant  and  unnec¬ 
essary  computations.  Recognizing  this,  we  abandoned  this  approach  and  reformulated 
the  finite-volume  procedure  to  manifest  the  key  physical  features  of  the  electromag¬ 
netic  field;  see  Appendix  A  for  technical  details.  The  final  result  was  a  more  efficient 
and  robust  code  than  the  one  that  was  developed  by  the  PI  several  years  earlier. 

Boundary  Conditions  The  nodally  collocated  scheme  places  all  components  of  the  electric 
and  magnetic  field  vectors  at  one  location.  Hence,  there  exists  ambiguity  regarding 
how  to  best  satisfy  the  requisite  boundary  conditions.  However,  once  we  adopted  our 
new  formulation,  this  ambiguity  was  easily  resolved. 

Grid  Decoupling  With  the  hopes  of  developing  a  highly  efficient  scheme,  we  initially 
adopted  a  second-order  central  difference  reconstruction.  However,  we  eventually  dis¬ 
covered  that  such  an  approach  led  to  grid  decoupling,  which  is  most  observable  when 
the  field  is  excited  by  a  line  source.  To  prevent  decoupling,  we  adopted  a  split-flux 
formulation,  which  resolved  the  problem.  Unfortunately,  grid  decoupling  can  also  oc¬ 
cur  irrespective  of  the  reconstruction;  it  can  also  be  introduced  by  the  Runge-Kutta 
integrator.  This  latter  challenge  has  not  been  fully  resolved  at  the  time  of  this  writing. 

Dissipation  Extensive  numerical  tests  on  one-  and  two-dimensional  geometries  were  con¬ 
ducted  to  verify  the  methodology  and  the  code  implementation.  These  tests  predicted 
the  expected  result  and  demonstrated  the  anticipated  numerical  errors.  However,  when 
the  code  was  applied  to  the  microstrip  patch  problem  and  other  three-dimensional  ge¬ 
ometries,  significant  dissipation  was  introduced  into  the  solution.  All  tests  conducted 
so  far  indicate  that  this  is  exclusively  an  algorithm  related  error  associated  with  upwind 
schemes  rather  than  a  software  coding  error. 

3.3  D4-RK4  Solver:  Technical  Challenges 

In  the  development  of  the  D4-RK4  FDTD  solver,  several  technical  challenges  presented 

themselves  over  the  course  of  the  contract.  These  are  briefly  articulated  below. 

Domain  Truncation  Given  the  desirability  of  producing  highly  accurate  schemes,  the 
truncation  of  the  open  domain  became  an  important  issue  to  address.  The  decision 
was  made  to  adopt  a  perfectly  matched  layer  (PML)  approach.  Significant  effort  was 
applied  to  deducing  a  methodology  to  implement  the  PML  methodology  into  the  RK4 
integrator.  The  final  outcome  was  a  precise  domain  truncation  scheme. 

Stability  It  is  well  known  that  high-order,  three-dimensional,  time-stable  schemes  are 
difficult  to  design  and  implement.  To  address  this  fact,  two  choices  were  presented 
to  us:  1)  to  implement  a  spatial  filter  to  dampen  out  the  high-frequency  instabilities, 
or  2)  to  use  a  summation-by-parts  formulation  and  design  a  priori  stable  boundary 
operators.  Due  to  the  costly  computational  burden  of  the  former,  we  chose  the  latter. 
Some  progress  has  been  made  in  the  design  of  time-stable  solvers.  Specifically,  this 
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research  is  the  first  to  devise  a  theory  and  implement  that  theory  for  two-dimensional 
solvers.  Of  course,  more  work  is  needed  to  devise  a  methodology  for  problems  that 
span  three  dimensions. 

Grid  Decoupling  As  with  the  finite-volume  solver,  the  issue  of  grid  decoupling  is  of  con¬ 
cern  in  this  work  as  well. 

3.4  CC4-RK4  Solver:  Technical  Challenges 

The  technical  challenges  associated  with  the  CC4-RK4  solver  are  essentially  the  same  as 
those  for  the  D4-RK4  solver;  the  reader  is  referred  to  section  3.3  for  that  discussion. 

3.5  C4-LF4  Solver:  Technical  Challenges 

The  C4-LF4  solver  has  the  most  promise  of  maturing  into  a  solver  for  mainstream  use. 
Since  the  scheme  is  entirely  central  and  staggered  in  time  and  space,  grid  decoupling  issues 
are  not  encountered.  Even  though  the  dominant  error  mechanism  is  dispersion,  rather  than 
dissipation,  numerical  case  studies  indicate  that  this  error  mechanism  can  be  mitigated 
with  judicial  settings  of  the  numerical  parameters.  This  leaves  only  boundary  condition 
implementation  as  the  major  technical  issue  to  be  resolved.  This  issue  is  briefly  discussed 
below: 

Boundary  Conditions  One  solution  that  has  been  considered  and  tested  is  the  use  of 
compact  differences  [7].  Here,  we  can  set  the  boundary  conditions  for  the  magnetic  field 
exactly.  However,  for  the  electric  field,  the  use  of  image  terms  or  one-side  differences 
need  to  be  invoked  near  the  boundary.  It  is  not  clear  at  this  time  if  such  a  solution  will 
yield  a  stable  scheme  for  all  problems  of  interest.  A  second  approach  is  to  use  low- 
order  differences  in  the  direction  where  the  field  is  slowly  varying  in  space.  For  example, 
for  the  simulation  of  the  electromagnetic  field  associated  with  planar  circuits,  it  is  well 
known  that  the  electric  and  magnetic  field  slowly  varies  in  the  transverse  plane  relative 
to  the  propagation  direction.  Hence,  for  this  application,  a  second-central  difference 
can  be  applied  in  the  transverse  plane  and  high-order  central  differences  can  be  applied 
in  the  direction  of  propagation.  The  resulting  scheme  would  indeed  be  low-order,  but 
highly  accurate. 

3.6  Yee  FDTD  Solver:  Technical  Challenges 

The  Yee  FDTD  solver  is  based  on  a  very  mature  foundation  of  research  knowledge.  Hence, 
no  real  technical  challenges  arose  during  the  development  of  the  numerical  code.  Further 
work  is  necessary  to  improve  the  code’s  robustness  and  efficiency. 
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Model 

CPU 

RAM 

Hard  Drive 

HP  C180 

180  MHz 

384  MB 

4  GB 

HP  C180 

180  MHz 

128  MB 

4  GB 

HP  C110 

110  MHz 

128  MB 

4  GB 

Table  2:  Hewlett-Packard  workstations. 


Product 

Description 

Mathematica 

Symbolic  manipulator  package 

Field  View 

Data  post-processing  colorization  plotting  package 

Table  3:  Software  tools. 

4  Graduate  Students 

A  large  portion  of  the  contractual  funding  was  dedicated  to  the  employment  of  two  graduate 
students:  Mr.  Ronald  Nelson  and  Mr.  James  Nystrom.  Mr.  Nelson  was  hired  in  January 
1998  and  is  anticipated  to  complete  his  Masters  of  Science  in  Electrical  Engineering  in 
December  1999.  Mr.  Nelson’s  primary  task  was  to  develop  and  test  the  finite-volume,  time- 
domain  solver.  Mr.  Nystrom  was  hired  in  January  1997  and  is  anticipated  to  complete  his 
PhD  in  Electrical  Engineering  in  May  2000.  Mr.  Nystrom’s  primary  task  was  to  develop 
and  test  the  high-order  finite-difference  solver.  Mr.  Nystrom  also  played  an  key  role  in  the 
development  of  the  k- space  design  method  of  discrete  operators. 

5  Interactions  with  AFRL  Personnel 

The  obligations  set  forth  in  the  Pi’s  proposal  were  performed  separate  from,  but  in  con¬ 
sultation  with,  the  scientists  and  engineers  of  the  Air  Force  Research  Laboratory  (AFRL), 
Air  Vehicles  Directorate.  The  principle  points  of  contact  were  Dr.  Joseph  Shang  (Senior 
Scientist)  and  Dr.  Datta  Gaitonde  (Research  Engineer).  Over  the  past  three  years,  three 
publications  have  borne  the  names  of  these  individuals  and  the  (PI);  one  manuscript  is 
currently  being  considered  for  publication. 

6  Facilities 

In  addition  to  salaries,  contractual  funding  was  also  used  to  acquire  equipment  and  software; 
see  Table  2  for  a  list  of  the  major  equipment  purchases  and  Table  3  for  software  purchases. 
Due  to  these  acquisitions,  the  Laboratory  of  Applied  Computational  Electromagnetics  is 
now  able  to  conduct  internationally  competitive  research. 
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7  Web  Page 

Computer  animations  showing  the  evolution  of  the  electromagnetic  field  associated  with  the 
microstrip  patch  antenna,  microstrip  transmission  line,  the  rectangular  cylindrical  scatter, 
as  well  as  other  geometries,  can  be  found  at  the  following  world  wide  web  address: 

http:// www.  mrc.  uidaho.edu /mrc /people/nystrom  /  antennas_l  .htrnl . 

These  color  animations  were  created  using  the  post-processing  software  tool,  Field  View, 
which  was  procured  with  contract  funds.  Not  only  do  such  animations  provide  valuable  in¬ 
sight  into  the  phenomenology  of  the  electromagnetic  field,  they  are  also  valuable  in  detecting 
code  induced  errors  in  the  data  sets. 


8  Future  Work 

As  mentioned  earlier,  the  development  and  test  of  the  finite-volume  and  the  finite-difference 
solvers  will  be  continued  by  the  two  graduate  students  Nelson  and  Nystrom.  It  is  estimated 
that  this  development  and  testing  process  will  continue  into  December  1999  for  Mr.  Nelson 
and  into  May  2000  for  Mr.  Nystrom.  No  funding  will  be  solicited  from  the  DEPSCoR 
program  office  to  fund  these  tasks. 

In  addition  to  these  two  endeavors,  efforts  are  also  underway  to  improve  the  efficiency, 
usability  and  accuracy  of  our  Yee  solver.  Based  upon  current  research,  we  believe  that  the 
solution  accuracy  of  the  solver  can  be  improved  by  incorporating  the  PML  of  Zhao  et  al. 

With  this  DEPSCoR  grant,  the  PI  has  established  the  necessary  infrastructure  to  operate 
a  nationally  competitive  research  program.  To  maintain  this  program  and  to  further  advance 
the  state  of  knowledge  in  highly-accurate  algorithm  development,  the  PI  has  submitted  a 
follow-up  proposal  to  the  DEPSCoR  program  office.  Due  to  the  many  technical  deficiencies  of 
the  nodally  collocated  schemes  (i.e.,  dissipation  errors,  grid  decoupling,  boundary  condition 
implementation,  code  inefficiencies,  etc.),  this  new  proposal  emphasizes  the  development 
of  highly-accurate,  nodally  uncollocated  schemes  and  leap-frog  integration.  Both  of  these 
concepts  have  been  examined  in  detail  over  the  past  several  years  but  have  not  been  applied 
to  problems  of  practical  interest.  For  the  microwave  circuits  and  antenna  engineer,  such 
problems  would  include  radiation  from  an  array  of  planar  patch  antennas  or  the  propagation 
of  an  electromagnetic  signal  through  a  microstrip  circuit.  For  problems  of  this  nature  that 
are  electrically  large,  both  highly  accurate  and  highly  efficient  numerical  solvers  are  needed. 
We  believe  this  new  thrust  has  the  promise  of  achieving  both. 
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A  Appendix:  Finite— Volume  Procedure 

It  is  well  known  that  the  mathematical  description  of  the  electromagnetic  field  is  given  as  a 
solution  to  the  two  curl  and  two  divergence  equations  of  Maxwell  [8].  In  the  ensuing  subsec¬ 
tions,  a  short  description  of  these  equations  is  provided  and  methods  for  their  discretization 
are  examined  in  detail  in  the  context  of  the  finite-volume  procedure. 

Letting  E  and  H  denote  the  vector  symbols  of  the  electric  and  magnetic  fields,  respec¬ 
tively,  Maxwell  postulated  that 

|i®+VxB  =  0,  (4) 


5E 

e—  -  V  x  H  =  0, 


(5) 


V  •  E  =  0, 

(6) 

and 

V  •  H  =  0. 

(7) 

The  previous  description  assumes  that  the  space  under  consideration  is  devoid  of  sources.  In 
addition,  the  assumptions  that  the  permittivity  e  and  the  permeability  //  are  scalars  and  of 
constant  value  are  made.  For  purposes  of  numerical  computation,  the  two  curl  equations  are 
discretized  and  advanced  in  time;  the  two  divergence  equations  are  additional  constraints 
imposed  upon  the  fields. 

An  alternate  form  of  Maxwell’s  equations  that  has  significance  to  certain  discretization 
operators  is  obtained  by  adding  and  subtracting  like  terms  within  the  two  curl  equations: 

3H 

+  V  x  F+  +  V  x  Fe-  =  0 

(8) 

- v  * _  v  x  F™ = °’ 

(9) 

where 

P;  =  (E-ipx  H)/2, 

(10) 

F,:  —  (E  +  r/n  X  H)/2, 

(11) 

(12) 

and 

f™=Kh-“D- 

(13) 

In  Eqns.  (10)— (13) ,  n  is  an  arbitrary  unit  vector  yet  to  be  specified, 
and  F+  +  F“  =  H,  as  required.  More  importantly,  for  a  plane 

Obviously,  F+  +  F“  =  E 
wave  traveling  in  the  n 

direction,  Fe  and  Fm  are  identically  zero,  whereas  F+  and  F+  are  not.  Likewise,  for  a  plane 
wave  traveling  in  the  negative  n  direction,  F+  and  F+  are  identically  zero,  whereas  Ft  and 


26 


F~  are  not.  Thus,  it  can  be  argued  that  the  role  of  F+,F+,Fj  and  F“  is  to  discriminate 
between  a  plane  wave  traveling  in  the  positive  or  negative  n  directions. 

The  global  form  of  the  governing  equations  can  be  obtained  by  integrating  the  point-form 
equations  over  some  volume  V.  The  volume  is  enclosed  by  the  surface  5,  which  is  defined  by 
the  outward  pointing  normal  n.  Integrating  over  V  and  invoking  the  vector  curl  theorem, 
we  obtain 

dEa  1  r 
~  V  JsU 


and 


'  dt 
dHa 


xH  dS 


(14) 


=  1/ 
VJs 


E  x  n  dS. 


(15) 


^  dt  v  js 

Here  E°  and  H°  are  the  average  values  of  E  and  H,  respectively,  within  the  volume  V.  That 
is, 

Ea  =  -  [  EdV  (16) 

V  Jv 

and 


H  dV. 


(17) 


A  chief  attribute  of  both  Eqns.  (14)  and  (15)  is  the  exclusive  manifestation  of  tangential 
field  components  on  the  volume  surface.  This  attribute  is  compatible  with  the  boundary 
conditions  for  E  and  H  [8]. 

The  integration  of  Eqns.  (8)  and  (9)  throughout  the  volume  V  is  accomplished  in  the 
exact  same  manner  as  in  the  previous  section.  The  result  is 


dEa 

dt 


=  i/snxFirfS  +  i/s„ 


xF m  dS 


and 


3H°  1  r  .  1  /*„_ 

^=vJsFtxildS+vJsF;Xn  dS' 


(18) 


(19) 


The  unit  vector  n  in  Eqns.  (10)-(13)  is  now  defined  to  be  the  vector  normal  to  the  surface 
S. 

Given  the  similarity  of  Eqns.  (14),  (15),  (18),  and  (19),  we  consider  the  following  canon¬ 
ical  integral  equation  as  the  basis  for  the  ensuing  mathematical  development: 


aua 

dt 


-U' 


x  F  dS 


(20) 


where  Ua  denotes  either  eE°  or  ^Ha;  F  denotes  either  — E,  H,  — F+  —  F“  or  F+  +  F" , 
depending  upon  the  context.  Also,  in  anticipation  of  the  numerical  procedure,  the  volume 
V  is  assumed  to  be  comprised  of  six  faces  Si,  S2,  *  •  • ,  Se\  the  sum  of  these  faces  is  the  closed 
surface  S.  For  a  six-sided  volume,  Eqn.  (20)  reduces  to 


<9Ua  1 
dt  ~  Vfti  1 


(21) 
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where 


(22) 


1 1  =  ii/  x  F  dS ; 

Js, 

here  n /  is  the  outward-pointing  normal  to  5/. 

The  evaluation  of  the  integrals  in  Eqn.  (21)  is  facilitated  by  establishing  a  curvilinear 
transformation  that  maps  V  into  V7,  the  transformed  region  of  integration.  The  mapping  is 
accomplished  through  the  following  equations:  £  =  £(x,  y,  z),  r]  =  t](x,  y,  z)  and  £  =  £(x,  y,  z). 
By  invoking  the  Jacobian  J,  we  compute  the  average  value  via 

Ua  =  i  J  V'  J  <2£<fyd£;  (23) 

here  U'(£, »?,  £,  t)  =  U(x(£,  rj, £),  y(£,  tj ,  £),  z(£,  i h  £),  <)• 

Next  consider  the  integration  over  the  six  faces.  The  surfaces  Si,  S 2,  S3,  S4,  S5  and  ^6  are 
defined  by  £  =  £1  (£1  is  a  constant),  £  =  £2,  V  =  f?i,  W  =  772,  £  =  Ci  and  £  =  £2,  respectively. 
One  of  these  surfaces,  say  S2  and  its  integral  I2,  is  the  focus  of  the  following  development; 
the  remaining  five  surface  integrals  are  evaluated  in  a  similar  fashion.  Considering  Eqn.  (22) 
for  l  =  2,  we  obtain,  through  a  change  of  variables  the  equation, 

I2  =  /  n2  x  F'(£2,  r 7,  £,  t)  J\ V£|d77d£,  (24) 

Js2 


where  F'(£, *?,£,*)  =  F(x(£, r], £),?/(£, 7?, £),z(£, 77, £),<)  and  S2  is  the  transformed  region 
associated  with  S2.  Since  V£  is  perpendicular  to  surfaces  of  constant  £,  n2|V£j  =  V£ 
and 

I2=  /  V£xF'(£2,??,£,t)Jdr?d£.  (25) 

J  S2 

One  approach  towards  evaluating  I2  from  known  cell  average  values  consists  of  introducing 
a  primitive  vector  W,  whose  derivative  is  /2  [9].  Stated  mathematically, 


dW2  = 
5£  " 


(26) 


Primitive  vectors  for  the  other  surface  integrals  are  defined  in  a  similar  manner.  It  is  apparent 
from  the  above  equation  that  value  of  the  surface  integral  hinges  upon  the  mechanism  for 
computing  the  primitive  vector  and  its  derivative. 

Values  for  the  primitive  vector  are  deduced  by  considering  a  more  general  definition  for 
its  derivative  than  the  one  provided  by  Eqn.  (26).  Let 


aw(£) 


=  L  V£xF'(£,*7,£W£. 

J  Sf(t  j 


d£  Js'(S) 

From  this  definition,  it  follows  that  5'(£2)  =  S'2  and 


SW(£) 


'5£ 


dW2 
dt  ‘ 


(27) 


(28) 
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Then,  by  means  of  Riemann  integration  of  Eqn.  (27), 

W(f)  =  W(ft)  +  /'/,.  v«  x  V(T,n,QJ<hvKdT.  (29) 

•'fi  •'S,(t) 

The  evaluation  of  this  integral  at  C  =  £2  yields 

W(6)  =  W(Ci)  +  /*  [  VC  x  F'(r,  77,  QJd^dr.  (30) 

Je. 1  Js’{t) 

By  definition  the  integration  term  is  proportional  to  the  average  value  of  V£  x  F  over  the 
volume  V  and  hence, 

W(&)  =  W(£i)  +  V(V£xF)B.  (31) 

The  antecedent  equation  fundamentally  states  that  the  exact  value  of  W  at  domain  walls 
can  be  determined  from  the  average  value  of  VC  x  F  within  that  domain. 

The  previous  equation  is  exact,  provided  that  the  metric  information  associated  with  V£ 
is  known  exactly.  To  compute  the  metrics,  a  mathematical  description  of  the  C  is  required. 
Consider  the  description  C  =  &+£«,  where  C;  and  £n  represent  the  linear  and  non-linear 
terms  of  C,  respectively.  Since  VC/  is  constant,  it  follows  from  Eqn.  (31)  that 

W(6)  =  W(6)  +  EV&  x  F°  +  E(VC„  x  F)a.  (32) 

If  VCn  is  regarded  as  sufficiently  small  throughout  the  volume,  then 

W(6)  ~  W(6)  +  VVh  x  F“,  (33) 

which  is  a  second-order  approximation.  The  error  of  this  approximation  e  is  obtained  by 
the  mean  value  theorem  of  integral  calculus: 

e  =  F'(fo,  Tjo-,  Co)  Jvi  VtndtdrjdC,  (34) 

where  Co,  ??o,  Co  is  some  point  in  the  volume  V’.  For  volumes  that  are  parallelepipeds,  e  =  0. 

A  value  for  VC;  can  be  obtained  from  the  geometry  of  the  volume.  In  a  separate  note,  we 
show  that  VC  ~  <5{A£/V,  where  5$  is  the  distance  between  S\  and  S2  and  A£  is  the  average 
directed  surface  area  in  the  C  direction  within  V .  By  definition, 

Ai  =  T(£A{M-  <35> 


Therefore,  Eqn.  (33)  reduces  to 

W(C2)«W(Ci)  +  <5eA£xF°,  (36) 

Due  to  the  simplicity  of  the  Eqn.  (36),  the  code  development  for  the  primitive  vector  is 
trivial. 
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Figure  19:  A  pictorial  representation  of  the  indexing  scheme  used  in  the  flux  reconstruction. 


To  obtain  an  approximation  for  the  derivative  of  the  primitive  vector,  the  domain  V  is 
sub-divided  into  many  sub-volumes,  each  denoted  as  1^_i/2j-i/2,*:-i/2;  Eqn.  (21)  is  now  an 
equation  over  each  sub-domain.  The  six  surfaces  defining  each  sub-volume  are  still  denoted 
locally  as  Si  through  56.  Globally,  the  surfaces  Si  and  52  are  defined  by  the  equations 
£  =  i  and  £  =  &,  respectively.  Likewise,  rj  =  ty_x,  r\  =  rjj ,  C  =  Ofe-i  and  C  =  C fc  globally 
define  S3, S4, S5  and  S6,  respectively.  Using  this  notation,  we  cast  Eqn.  (31)  as  follows: 


w i  =  W i_i  +  Ui_i/2(V^  x  F)“_1/2,  1  <  i  <  /pts  (37) 


where  (V£  x  F)“_1//2  is  the  average  value  of  V£  x  F  in  the  volume  Vi_!/2  defined  by  the  grid 
lines  i  —  1  and  i.  (The  indices  j,  k  are  dropped  for  clarity;  see  Figure  19  for  a  one-dimensional 
depiction.)  Since  W  is  to  be  differentiated,  Wo  may  be  set  to  zero,  or  to  any  other  value. 

To  estimate  the  derivative  of  W  at  the  cell  surface  i  (i.e.,  the  value  of  I2),  we  simply 
employ  standard  difference  stencils.  For  example,  a  second-order  central  approximation  is 
given  by 


dW  (W,+i  —  W,_i) 
S5  ~(  26(  )■ 


(38) 


Substituting  Eqn.  (37)  in  Eqn.  (38),  we  obtain 


dW 


dt 


U»+i/2(V£  x  F)°+1/2  +  Ui-i/2(V£  x  F)?_1/a 
2 


Under  the  assumption  that  the  metrics  are  computed  to  second-order, 

awl 


'(A|  x  FQ)H1/2  +  (A|  x  Fa)j_1/2 
2 


)• 


(39) 


(40) 


where  (A£  x  Fa)i±1/2  is  the  average  value  of  in  volume  i  ±  1/2  crossed  into  the  average 
value  of  F  in  volume  i  ±  1/2.  Eqn.  (40)  is  valid  for  cells  of  unequal  size  or  for  stretched 
grids.  For  uniform  grids,  it  seems  reasonable  to  replace  Eqn.  (40)  with 


aw 

dZ 


»  A  jX 

i 


r  z+1/2  +  1/2 


(41) 
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Per  Eqn.  (26),  the  right  hand  side  of  Eqn.  (41)  is  an  estimated  value  of  the  surface  integral 
I2.  The  equation  states  that  the  value  of  the  surface  integral  is  reconstructed  from  average 
values  of  F. 

The  previous  equation  is  recognized  as  being  nothing  more  than  a  central  average  re¬ 
construction  and  is  one  of  many  reconstruction  formulas.  As  with  the  primitive  vector,  its 
computation  is  trivial.  Within  this  equation,  there  exists  two  approximations.  The  first 
is  associated  with  the  discretization  of  the  derivative  of  the  primitive  vector.  The  second 
is  associated  with  the  accuracy  of  the  metric  computation.  Both  of  these  approximations 
influence  the  overall  accuracy  of  the  reconstructed  value.  Instead  of  using  a  second-order 
central  derivative,  a  fourth-order  central  derivative  could  be  employed.  If  however,  the  met¬ 
ric  is  computed  to  second-order,  the  reconstructed  value  is  strictly  second-order.  For  cells 
that  are  near  parallelepipeds,  the  non-linear  metric  terms  are  negligible  and  the  accuracy  of 
the  reconstructed  value  is  strongly  influenced  by  the  accuracy  of  the  derivative  calculation. 
Thus,  even  if  the  scheme  is  not  strictly  high-order,  it  still  may  be  highly  accurate. 

One  difficulty  with  the  employment  of  central  differences  for  the  calculation  of  the  prim¬ 
itive  vector  derivative  (in  conjunction  with  nodally  collocated  schemes)  is  the  problem  of 
grid  decoupling  [10].  Since  a  central  operator  has  holes  in  its  stencil,  the  unknown  and  its 
derivative  at  a  node  are  decoupled  from  each  other.  For  point-source  excitation,  the  result 
of  this  effect  is  an  interlacement  of  zeros  in  the  data  [11].  To  overcome  this  problem,  we 
consider  a  split  flux  treatment,  which  discretizes  F+  and  F~,  rather  than  F  [12].  Since 
F  =  F+  +  F-,  it  follows  from  Eqn.  (37)  that  W  =  W+  +  W",  where 


and 


w,+  =  W£,  +  (V?  X  F+)“_1/2, 

1  ^  ^  ^  -^pts 

(42) 

w-  =  W-  ,  +  (V{  X  F-)?_,/2, 

1  <  i  ^  ^pts 

(43) 

The  differentiation  of  W+  and  W_  is  accomplished  by  establishing  difference  stencils  within 
the  domain  of  influence  of  these  two  vectors.  That  is,  for  the  positive-going  waves  (i.e.,  F+), 
backward  stencils  are  invoked;  for  the  negative-going  waves  (i.e.,  F-),  forward  stencils  are 
invoked.  For  example,  a  second-order  windward  approximation  yields 


aw+ 1  __  /3W f  -  mu + w+. 


and 


ae 

aw- 


{ 


2  sf 


ae 


-3W-  +  4W. 


i+1  -  W^2 


2  Sf 


(44) 


(45) 


For  second-order  metrics,  the  final  result,  as  obtained  from  the  previous  equations  and 
expressed  in  terms  of  F°,  is 


dW+ 

d£ 


Aj  x 


^(F+)t./2  -  (F+)?— 3/2 \  ' 


(46) 
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and 


dW~ 

dt 


A  i  x 


^3(F  )°+1/ 2  (F  )°+3/2^ 


(47) 


The  sum  of  these  two  derivatives  is  the  derivative  of  the  total  primitive  vector  (or,  the  value 
of  the  surface  integral  under  investigation.) 

A  table  of  commonly  used  stencils  that  may  be  invoked  for  the  derivative  of  W,  W+ 
or  W-  may  be  found  in  many  texts  (e.g.,  [13]).  Other  stencils  that  are  neither  central  or 
strictly  forward  (or  backward)  can  also  be  constructed  via  standard  Taylor  series  analysis. 
For  such  stencils,  we  denote  them  as  windward  biased.  In  the  numerical  examples  section, 
we  choose  to  adopt  the  third-order  windward  biased  stencil  (3WB1)  as  the  stencil  under 
investigation.  That  is,  let 


dW+,  _  /2W+ ,  +  3W+  -  6Wti  +  WjV 

dt 


and 


dW- 


dZ 


V 

-2W-,  - 3W~  +  6W~+1  -  W~+2 

6^ 


t±2^ 


(48) 

(49) 


Based  upon  stability  arguments  and  four-stage  Runge-Kutta  integration,  central  differ¬ 
ences  may  be  used  for  either  W,  W+  or  W~.  Windward  and  windward  biased  stencils  are 
typically  used  for  only  W+  or  W_. 

This  section  is  closed  by  noting  that  all  of  the  difference  stencils  discussed  so  far  are  of 
the  explicit  type.  An  alternative  to  explicit  differencing  is  implicit  differencing,  or  compact 
differencing.  For  example,  consider  the  following  central  compact  operator  [14,  15]: 


ttQt+i  +  Qi  +  o:Qt-i  =  /5(Wi+i  —  Wj_i).  (50) 

If  a  =  1/4  and  0  =  3/4,  then  Q j  is  approximately  the  derivative  of  W  at  z;  the  degree  of 
approximation  is  fourth-order.  Unlike  the  fourth-order  explicit  central  operator,  which  has 
a  stencil  that  spans  five  cells,  the  fourth-order  compact  operator  spans  only  three  cells.  This 
attribute  is  often  quoted  in  the  context  of  boundary  operator  implementation.  Although 
the  above  equation  represents  a  tri-diagonal  matrix  equation,  the  solution  of  that  equation 
is  easily  obtained  by  employing  the  Thomas  algorithm  [16].  The  number  of  operations 
associated  with  the  computation  of  Q,  is  five. 

In  addition  to  the  equations  that  govern  propagation,  Maxwell’s  equations  in  global  form 
require  that  tangential  E  and  H  be  continuous  across  a  dielectric  interface.  At  a  perfect 
electrical  conductor  defined  by  the  normal  n  there  exist  two  conditions  of  interest: 


n  x  E  =  0 

These  equations  are  equivalent  to 

Et  =  0 


n  x  V  x  H  =  0. 


dHt 

dn 


=  0, 


(51) 


(52) 
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where  the  subscript  t  denotes  the  tangential  component.  Although  other  boundary  conditions 
may  be  derived,  uniqueness  of  solution  is  assured  when  the  tangential  fields  are  specified 
on  the  boundary.  Moreover,  the  boundary  conditions  are  compatible  with  the  integrand 
of  the  surface  integrals,  which  highlights  an  important  feature  of  the  nodally  collocated 
finite-volume  procedure.  The  algorithmic  implementation  of  these  boundary  conditions  is 
considered  next. 

Suppose  that  i  =  0  (locally  defined  surface  Si)  resides  on  a  perfect  electric  conductor 
(PEC)  and  the  domain  is  specified  by  i  >  0.  If  the  integrand  of  Ii  is  a  function  of  tangential 
E,  then  Ii  =  0.  If  the  integrand  of  Ii  is  a  function  of  tangential  H,  then  from  the  identity 

w=vLn'xHdS’  (53) 

it  follows  that  a  Neumann  condition  is  implemented  by  setting 

d2Wj  „ 

~d£T  =  0  on  Sl-  (54) 

To  second-order,  this  condition  is  satisfied  by  employing  an  image  term  W^-l)  and  by 
setting  that  image  term  to  the  value  of  2Wi(0)  —  W(l).  With  this  image  term,  second-order 
central  (2C)  or  3WB1  operators  can  be  invoked  on  the  boundary;  second-order  windward 
(2W)  or  3WB1  operators  can  be  invoked  one  cell  in  from  the  boundary.  Note:  For  the  case 
when  3WB1  operators  are  used  in  the  interior  and  2C  operators  are  used  on  the  boundary, 
the  global  spatial  accuracy  is  still  third-order  [17]. 

For  the  situation  when  Si  resides  on  a  dielectric  interface,  at  least  two  possibilities  arise. 
The  fact  that  the  tangential  fields  are  continuous  across  the  interface  (but  a  discontinuous 
first  derivative)  allows  one  to  reconstruct  at  that  interface  with  second-order  central  averages; 
the  reconstruction  is  in  non-split  form.  In  split  form,  the  process  of  reconstruction  is  more 
complicated.  The  complication  arises  from  the  discontinuity  of  F,  due  to  the  tj  term.  To 
handle  this  discontinuity,  the  primitive  vectors  associated  with  E  and  H  are  built  as  before. 
However,  two  values  of  F  (i.e.,  two  values  of  F+  and  two  values  of  F")  must  be  deduced, 
depending  upon  the  cell  under  consideration.  To  the  left  of  Si,  F  is  a  function  of  rja‘,  this  F 
is  used  in  the  time  advancement  of  U?_1/2.  To  the  right  of  Si,  F  is  a  function  of  rjb ;  this  F 
is  used  in  the  time  advancement  of  U“+1/,2. 

The  truncation  of  the  domain  or  the  development  of  an  absorbing  boundary  condition 
(ABC)  can  be  accomplished  by  several  means.  The  perfectly  matched  layer  (PML)  is  one 
such  possibility  [6].  The  development  of  such  a  PML  for  high-order  FVTD  schemes  is  the 
subject  of  future  work.  In  this  work,  characteristic  theory  is  adopted  for  simplicity.  In  split 
flux  form,  we  have  already  noted  that  F+  and  F_  only  act  on  positive-  and  negative-going 
waves,  respectively.  Thus  a  suitable  ABC  can  be  devised  by  setting  either  F+  or  F_  to  zero, 
depending  upon  which  vector  represents  an  out-going  wave. 

Upon  completion  of  the  reconstruction  phase,  the  semi-discrete  equations  take  on  the 
form 

dn 

a  =  Au’  <55> 
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which  constitutes  a  constant  coefficient,  linear  system  of  ordinary  differential  equations. 
Here  u  represents  every  value  of  Ua  throughout  the  computational  domain  and  A  is  the 
connectivity  matrix.  Since  our  desire  is  to  obtain  high-order  accuracy,  the  classical  four- 
stage  scheme  is  invoked  and  labeled  as  RK4.  In  vector  form  and  in  terms  of  the  auxiliary 
vectors  kx  •  •  k4,  the  standard  RK4  scheme  is  as  follows  [18]: 

ki  =  St  Au" 
k2  =  StA  (un  +  ki/2) 
k3  =  St  A(un  +  k2/2) 
k4  =  5tA(un  +  k3) 

u""*"1  =  un  +  (kx  +  2k2  +  2k3  +  k4)/6.  (56) 


Other  techniques  for  low  storage  RK  integrators  are  reported  in  the  literature  [19,  20].  Upon 
the  incorporation  of  all  four  stages  into  the  equation  for  un+1,  the  final  result  is 


u 


n+1 


-( 


1  +  St  A  +  yA2 


Furthermore,  by  substituting  Eqn.  (55)  into  Eqn.  (57),  we  obtain 


u 


n+1  _ 


(,  .  a  s?  a2  s}  a3  si  a1  \ 

{1+s,ai  +  lw+Tap+T4dr*) 


u 


(57) 


which  is  the  expected  truncated  Taylor  series.  These  two  equations  reveal  the  main  sources 
of  error:  1)  Spatial  discretization  errors  incorporated  in  the  approximation  of  chi /dt  by  Au 
and  2)  temporal  errors  as  a  result  of  the  truncated  Taylor  series.  The  manifestations  of  these 
errors,  usually  couched  in  terms  of  dissipation  and  dispersion  errors,  are  best  characterized 
in  the  frequency  domain.  Since  such  an  analysis  has  been  provided  in  many  sources,  we  only 
note  that  the  predominant  error  mechanism  is  dissipation  [14]. 

One  final  note:  Even  though  upwind  schemes  are  well  known  to  prevent  grid  decoupling, 
grid  decoupling  can  be  induced  via  the  Runge-Kutta  integrator.  Since  the  Runge-Kutta 
integrator  requires  that  both  field  components  be  advanced  simultaneously,  the  excitation 
of  one  field  component  is  not  made  manifest  to  the  other  field  component  at  the  instant  the 
excitation  is  applied.  This  effect  is  most  noticeable  when  modeling  volumetric  and  sheet 
sources;  the  effect  is  observed  near  the  source,  where  the  data  extremely  alternates  between 
positive  and  negative  values. 


34 


B  Appendix:  High— Order  Leap-Frog  Integrator 


Consider  Maxwell’s  equations  for  a  lossless  isotropic  homogeneous  medium  that  is  described 
by  the  permittivity  and  permeability  parameters  e  and  //,  respectively;  the  medium  is  also 
devoid  of  sources: 

e?  =  VxH  (59) 


and 


dt 

dH  „  „ 

fi—  =  -V  X  E. 


(60) 


Here  E  and  H  are  the  electric  and  magnetic  intensities,  respectively. 

To  accomplish  the  spatial  discretization,  the  field  components  are  arranged  on  the  Yee 
grid  [21]  in  the  usual  manner.  Then,  for  each  derivative  that  appears  in  the  curl  operators, 
central  differences  are  invoked.  For  example,  letting  xp  represent  a  component  of  either  E  or 
H  and  letting  Sx  be  the  length  of  the  cell,  we  write  at  node  i, 


d xp 
dx 


-  <fe-V2  = 


Here  L2  denotes  the  second-order  order  discrete  difference  operator.  In  like  fashion, 


dxp 

dx 


t^i+l/2  ~  1/2 


'A+w-'h-v  A  =  LiAi 


246. 


(61) 


(62) 


which  is  a  fourth-order  approximation.  In  this  case,  the  fourth-order  discrete  difference 
operator  is  represented  by  L4.  Sixth-  and  eighth-order  approximations  can  be  easily  effected 
using  standard  Taylor  analysis. 

To  verify  the  claim  on  accuracy  order,  let  xp  =  xp0e~ikx,  where  xpa  is  a  constant  scalar.  In 
accordance  with  Fourier  theory,  the  transform  of  d/dx  is  —jk,  whereas  the  transforms  of  L2 
and  Z-4  are  denoted  as  —jK2  and  —jK^  respectively.  Here, 


K2  = 


(63) 


and 

K4  =  K2(1  +  62xK22/24).  (64) 

Now  as  6X  — >  0,  it  is  a  simple  matter  to  show  that  K2  — x  k  +  0(6 1)  and  K4  — »  k  +  0[S*),  as 
expected. 

Extending  these  concepts  to  the  vector  fields  of  Maxwell,  we  deduce  that  the  Fourier 
transform  of  Vx  is  —  jk,  where  k  =  kx ax  +  ky +  kza.z,  and  the  Fourier  transform  of  the 
discrete  version  of  Vx  is  —  jKq,  where  q  =  2  or  q  —  4,  depending  upon  the  accuracy  order. 
For  example, 


K2  = 


(65) 
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A  similar  representation  for  K4  is  constructed  using  equations  like  Eqn.  (64).  Thus,  as 
fix,  fiy,  fiz  all  uniformly  go  to  zero,  Kg  -+  k  +  0(fi%,  5%,  5\). 

We  now  seek  to  develop  an  integrator  based  upon  leap-frog  concepts  and  the  following 
expansions.  At  t  =  nfit , 


M  1 

jjn+1/2  _  jjn-1/2  +  g  V'  _ 
m=  1  (odd) 


Hn 

\2 )  dtm 


and  at  t  =  (n  +  1/2 )fit, 


M  -I 

En+1  _  En  +  2  V'  _!_ 

rT  ^  tn! 

m=l  (orfd) 


dmEn+1/2 

J  dt™ 


(66) 


(67) 


When  M  —  1,  the  popular  second-order  Yee  scheme  emerges.  That  is,  replacing  time 
derivatives  with  curl  operators  in  accordance  with  Eqns.  (59)  and  (60),  we  obtain 

jjjjn+1/2  =  nHn~l/2  -  5tV  x  E"  (68) 


and 

eEn+1  =  eEn  +  StV  x  Hn+1/2.  (69) 

When  M  ±  1,  high-order  temporal  derivatives  of  the  field  vectors  must  also  be  known. 
For  a  fourth-order  approximation,  Fang  [22]  converted  third-order  temporal  derivatives  into 
third-order  spatial  derivatives  via  repeated  application  of  Maxwell’s  equations.  For  example, 

__  =  _  (_j  v  x  V  x  V  x  H,  (70) 

where  c  =  1/^/JIe.  As  can  be  inferred  from  the  previous  expression,  this  conversion  pro¬ 
cess  leads  to  a  computational  algorithm  of  great  complexity.  For  in  this  expression  there 
exists  mixed  higher-order  derivatives  that  are  not  compatible  with  the  Yee  grid  or  with  the 
boundary  conditions.  Moreover,  if  one  seeks  to  develop  a  sixth-  or  eighth-order  integrator, 
the  complexity  of  the  algorithm  increases  even  more  dramatically.  From  a  memory  point 
of  view,  the  algorithm  does  not  require  any  additional  temporary  storage  vectors.  Hence, 
Fang  chooses  memory  parsimony  at  the  expense  of  algorithm  complexity.  Like  Yee’s  original 
scheme,  Fang’s  scheme  requires  a  total  of  six  unknowns  per  three-dimensional  cell. 

Instead  of  converting  temporal  derivatives  into  spatial  derivatives,  we  choose  to  adopt  an 
extended  leap-frog  methodology.  For  example,  a  fourth-order  integrator  is  constructed  as 
follows: 


Tx  =  -(^)VxE" 

T2  =  (fit/e)V  x  Tx 
T3  =  -(V/i)VxT2 

jjn+1/2  =  H"-1/2  +  Tx  +  T3/24.  (71) 
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rvl/2  |_|  f»*1/2 


Figure  20:  A  pictorial  representation  of  the  extended  fourth-order,  leap-frog  integrator 
(multiplicative  constants  omitted). 

Here  Ti,T2  and  T3  are  temporary  storage  vectors.  To  compute  the  curl  operators,  it  is 
understood  from  the  previous  development  that  Tx  is  spatially  proportional  to  H;  hence,  H 
and  Tx  are  spatially  registered  on  the  Yee  grid.  By  way  of  similar  comparison,  E  and  T2 
are  spatially  registered,  as  are  H  and  T3.  Now  for  the  next  half  time  step, 

Ti  =  (St/e)V  x  Hn+1/2 

T2  =  -(SMVxTt 
T3  =  (6t/e)V  x  T2 

En+1  =  En  +  Tx  +  T3/24.  (72) 

Here  the  pairs  (Tx,  E),  (T2,  H)  and  (T3,  E)  are  nodally  collocated.  A  pictorial  representation 
of  the  extended  fourth-order  leap-frog  integrator  is  shown  in  Figure  20. 

For  the  sake  of  simplicity,  we  have  included  the  temporary  vector  T3  in  the  algorithm. 
However,  in  code  it  is  not  necessary  since  the  calculations  for  T3  and  En+1  (or  T3  and 
Hn+i/2)  can  be  combined.  Hence,  the  present  approach  requires  twelve  unknowns  (i.e.,  three 
components  of  E,  H,  Tx  and  T2)  per  cell  for  a  three-dimensional  domain.  Whereas  for 
the  four-stage  Runge-Kutta  (RK4)  integrator,  twelve  to  twenty-four  unknowns  per  cell  are 
needed,  depending  upon  the  approach  [18].  (As  storage  requirements  lessen  for  each  of  these 
RK4  procedures,  the  operational  count  or  the  algorithm  complexity  may  increase.)  Thus, 
it  can  be  argued  that  the  aforementioned  leap-frog  integrator  expends  additional  memory 
(albeit,  a  small  amount  of  additional  memory)  for  the  sake  of  algorithm  simplicity.  For 
we  note  that  one  could  write  two  subroutines:  one  to  compute  the  curl  of  E  and  one  to 
compute  the  curl  of  H  (the  curl  is  computed  to  the  level  of  spatial  accuracy  desired;  more 
will  be  said  on  this  in  an  ensuing  discussion).  In  each  of  these  subroutines,  the  appropriate 
boundary  condition  information  is  included.  Then,  to  advance  the  equations,  one  needs  only 
to  repeatedly  call  each  of  these  routines,  one  after  the  other,  in  accordance  with  the  previous 
set  of  equations.  At  the  end  of  each  half-time  step,  the  results  are  combined  to  obtain  the 
next  time-value  of  the  vector  under  consideration. 

Given  the  simplicity  of  the  fourth-order  algorithm,  it  requires  little  inductive  reasoning 
to  formulate  the  eighth-order  extended  leap-frog  integrator.  The  final  algorithm  is 

Tx  =  ~(6MVxEn 
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Figure  21:  A  pictorial  representation  of  the  extended  eighth-order,  leap-frog  integrator 
(multiplicative  constants  omitted;  only  one  stage  is  shown). 

T2  =  (d*/e)V  x  Tx 

T3  =  —  (5t//x)VxT2 
T2  =  («yt/e)V  x  T3 
T5  =  -(5t//i)V  X  T2 
T2  =  (Ve)V  X  T5 
T7  =  -(5t/n)V  x  T2 

Hn+i/2  _  Hn_1/2  +  Tx  +  T3/24  +  Ts/1920  +  T7/322560.  (73) 

In  the  same  fashion, 

Ti  =  (6t/e)V  x  Hn+1/2 

T2  =  -(SMVxTt 
T3  =  (St/e)V  x  T2 
T2  =  —(6t/n)V  x  T3 
T5  =  («5t/c)V  x  T2 

T2  =  -(^)VxT5 
T7  =  (<yt/e)V  x  T2 

e«+i  =  En  +  Tx  +  T3/24  +  Tg/1920  +  T7/322560.  (74) 

A  pictorial  representation  for  this  integrator  is  shown  in  Figure  21.  (Only  one  stage  is  shown.) 
Again,  the  temporary  variable  T7  is  used  to  clarify  the  algorithmic  procedure;  in  code  it  is 
not  needed.  Hence,  for  a  three-dimensional  domain,  the  eighth-order  integrator  requires  18 
unknowns  per  cell  (i.e.,  E,H,Tx,T2,  T3  and  T5).  In  general,  the  Mth-even-order  leap-frog 
integrator  requires  3M/2  +  6  unknowns  per  cell  for  M  >  4. 

The  complete  numerical  scheme  is  assembled  by  choosing  the  accuracy  order  for  the 
temporal  and  spatial  discretizations.  Limiting  ourselves  to  fourth-order  accurate  schemes 
or  less,  the  following  schemes  will  be  considered:  2x2,  2x4,  4x2  and  4x4,  where  the  first 
digit  denotes  the  temporal  accuracy  and  the  second  digit  denotes  the  spatial  accuracy.  To 
ascertain  the  error  mechanisms  of  these  aforementioned  schemes,  a  time-harmonic  plane 
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wave  is  assumed  for  each  of  the  field  constituents.  That  is,  let  E  =  E0ejwte-jfk  x  and  H  = 
H0ejwte~jyc'x.  Then  for  E  and  H  to  be  solutions  to  the  discretized  Maxwell’s  equations,  for 
an  unbounded  medium,  we  must  require  that 

jf2eE0  =  -jKq  x  H0  -  jKp(c252/ 24)Kg  x  [Kq  x  (Kg  x  Hc)]  (75) 

and 

jttfj.Ho  =  jKq  x  E0  +  jKp(c2S2 /24)Kg  x  [K,  x  (Kg  x  E0)],  (76) 

where  q  —  2  for  a  spatially  second-order  accurate  scheme  (see  Eqn.  (65))  and  q  =  4  for  a 
spatially  fourth-order  accurate  scheme.  Here  kp  is  a  switch  whose  value  is  zero  for  p  =  2 
(i.e.,  temporally  second-order)  and  unity  for  p  =  4  (temporally  fourth-order).  Moreover, 

n=(|)sinfr)’  (77) 

which  reduces  to  u  as  5t  ->  0.  On  the  Cartesian  grid,  Kg  •  E  =  0  and  Kg  •  H  =  0.  Hence, 
Eqns.  (75)  and  (76)  reduce  to 

f2eE0  =  -[1  -  kp(c2S2 /2i){Kq  ■  Kg)]Kg  x  H0  (78) 

and 

ttfiHo  =  [1  -  kp(c2S2/ 24)(Kg  •  Kg)]Kg  x  E0.  (79) 

Further  manipulations  of  the  previous  equations  results  in  the  elimination  of  the  field  vectors. 
The  final  equation  is  given  by 

tt2  =  c2 Kq  ■  Kg[l  -  kp(c262/ 24)(Kg  •  Kg)]2,  (80) 

which  is  the  numerical  dispersion  relationship  for  the  pXq  scheme.  As  the  grid  size  and 
the  time-step  approach  a  value  of  zero,  Eqn.  (80)  reduces  to  its  analytical  counterpart: 
ui2  =  c2k  ■  k. 

Eqn.  (80)  contains  the  complete  dispersion,  dissipation  and  stability  properties  of  the 
schemes.  First  consider  the  stability  property.  Fourier  stability  is  assured  if  Im{w}  >  0, 
where  a;  is  a  solution  of  Eqn.  (80).  Thus,  seeking  such  solutions  in  a  one-dimensional 
domain  for  which  Im{u;}  >  0,  one  finds  that  6t  must  be  less  in  value  than  v8x/c,  where  v  is 
the  maximum  CFL  number  [23].  For  three-dimensional  domains,  the  requirement  on  St  is 
St  <  vSxj {y/% c),  provided  the  cells  are  cubic.  Table  4  lists  the  maximum  CFL  values  for  each 
scheme.  (These  values  have  also  been  confirmed  through  experimental  methods.)  Although 
spatial  accuracy  does  not  have  much  impact  on  the  CFL  number,  temporal  accuracy  does. 
From  Table  4,  we  observe  that  the  fourth-order  in  time  schemes  have  a  CFL  number  at  least 
2.4  times  that  of  the  second-order  in  time  schemes.  Moreover,  the  4x4  scheme  has  a  factor 
of  2.44  in  CFL  performance  over  Fang’s  scheme,  which  has  a  CFL  of  unity  [24].  Even  though 
the  4x4  scheme  and  Fang’s  scheme  are  based  upon  the  same  Taylor  series  expansion,  they 
are  not  the  same  scheme.  The  4x4  scheme  uses  exclusively  fourth-order  central  differences  to 
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Method 

V 

2x2 

1 

2x4 

6/7 

4x2 

2.847 

4x4 

2.441 

Table  4:  Maximum  CFL  number  for  each  scheme. 


calculate  the  first-order  spatial  derivatives;  Fang’s  scheme  uses  a  mixture  of  approximations 
for  the  various  spatial  derivatives  of  different  orders  that  appear  in  the  algorithm. 

Provided  that  the  time-step  is  selected  to  yield  stable  data,  the  dissipation  and  dispersion 
properties  of  each  scheme  can  also  be  ascertained  from  Eqn.  (80).  First,  it  is  easily  confirmed 
that  stable  solutions  of  Eqn.  (80)  are  purely  real,  thus  implying  that  all  four  schemes  are 
dissipationless.  Second,  the  computed  phase  per  time  step  of  the  signal  </>c  is  simply  (for  a 
one-dimensional  domain), 


(j)c  —  2  sin  1 


( Y )  /KfP  -  -%,(c%V24)K-|]2 


(81) 


whereas  the  exact  phase  of  the  signal,  <j)e  is  to5t  =  v5xk  =  2ttu/Np,  where  Np  is  the  number 
of  cells  per  wavelength.  By  definition,  the  phase  error,  A <f>,  is  (j>e  —  <f)c.  A  plot  of  A 4>  is 
shown  in  Figure  22  as  a  function  of  Np  for  the  2x2,  2x4, 4x2, 4x4  schemes,  when  v  =  0.5.  As 
expected,  the  4x4  scheme  has  the  best  phase  error  characteristics.  For  example,  for  a  phase 
error  of  .1  mr  (i.e.,  phase  error  of  5.7  degrees  after  1000  time  steps),  the  4x4  scheme  requires 
Np  =  12;  for  the  2x2  scheme,  Np  =  33.  Another  interesting  feature  of  Figure  22  is  the  poor 
phase  performance  of  the  4x2  scheme  relative  to  the  2x2  scheme.  Finally,  the  2x4  is  noted 
to  be  the  only  phase  leading  scheme. 

Now  that  the  phase  characteristics  are  known,  we  can  ascertain  the  overall  efficiency 
of  the  2x2  and  the  4x4  schemes  on  a  three-dimensional  domain,  sub-divided  into  cubic 
cells.  The  2x2  scheme  requires  30  computations  per  cell  per  time  step  and  the  4x4  scheme 
requires  174  computations  per  cell  per  time  step.  Hence,  for  a  computational  domain  of 
one  wavelength  on  a  side,  a  phase  error  of  .1  mr  per  time  step,  a  CFL  of  0.5  and  for 
a  propagation  distance  of  one  wavelength  along  a  major  coordinate  axis,  the  2x2  scheme 
requires  71.2  million  computations  (i.e.,  (30)(33)4/(.5))  and  .216  million  memory  units  (i.e., 
6(33)3);  the  4x4  scheme  requires  only  7.22  million  computations  (i.e.,  (174)(12)4/(.5))  and 
20.7  thousand  memory  units  (i.e.,  12(12)3).  Thus,  for  this  particular  application,  the  4x4 
scheme  is  roughly  ten  times  more  efficient  and  requires  ten  times  less  memory  than  the  2x2 
scheme.  Although  this  example  is  idealized  in  many  ways,  the  example  substantiates  the 
significant  value  of  using  the  4x4  scheme  over  the  2x2  scheme. 

It  is  interesting  to  note  that  if  v  -  1.3  (near  the  3D  maximum  value  of  the  4x4  scheme), 
the  phase  error  analysis  indicates  that  Np  =  15  is  required  for  .1  mr  of  phase  error  per  time 
step.  Duplicating  then  the  previous  example,  one  finds  that  6.77  million  computations  (i.e., 
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Points  Per  Wavelength 


Figure  22:  Phase  error  plots  for  various  schemes:  v  =  0.5. 


(174)(15)4/(1.3))  and  40.5  thousand  memory  units  (i.e.,  12(15)3)  are  needed  to  propagate 
the  signal  one  wavelength.  Based  upon  this  example,  there  appears  to  be  no  significant 
benefit  to  operating  the  4x4  scheme  at  higher  CFL. 
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